AITopics | audio captcha

Collaborating Authors

audio captcha

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Aura-CAPTCHA: A Reinforcement Learning and GAN-Enhanced Multi-Modal CAPTCHA System

Chandra, Joydeep, Manhas, Prabal, Kaur, Ramanjot, Sahay, Rashi

arXiv.org Artificial IntelligenceAug-22-2025

Aura-CAPTCHA was developed as a multi-modal CAPTCHA system to address vulnerabilities in traditional methods that are increasingly bypassed by AI technologies, such as Optical Character Recognition (OCR) and adversarial image processing. The design integrated Generative Adversarial Networks (GANs) for generating dynamic image challenges, Reinforcement Learning (RL) for adaptive difficulty tuning, and Large Language Models (LLMs) for creating text and audio prompts. Visual challenges included 3x3 grid selections with at least three correct images, while audio challenges combined randomized numbers and words into a single task. RL adjusted difficulty based on incorrect attempts, response time, and suspicious user behavior. Evaluations on real-world traffic demonstrated a 92% human success rate and a 10% bot bypass rate, significantly outperforming existing CAPTCHA systems. The system provided a robust and scalable approach for securing online applications while remaining accessible to users, addressing gaps highlighted in previous research.

large language model, machine learning, reinforcement learning, (18 more...)

arXiv.org Artificial Intelligence

2508.14976

Country: Asia > India (0.47)

Genre: Research Report (0.82)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.89)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.88)

Add feedback

Moravec's Paradox: Towards an Auditory Turing Test

Noever, David, McKee, Forrest

arXiv.org Artificial IntelligenceAug-1-2025

This research work demonstrate s that current AI systems fail catastrophically on auditory tasks that humans perform effortlessly. Drawing inspiration from Moravec's paradox ( i.e., tasks simple for humans often prove difficult for machines, and vice vers a), we introduce a n auditory Turing test comprising 917 challenges across seven categories: overlapping speech, speech in noise, temporal distortion, spatial audio, coffee - shop noise, phone distortion, and perceptual illusions. Our evaluation of state - of - the - art audio models including GPT - 4's audio capabilities and OpenAI's Whisper reveals a striking failure rate exceeding 93%, with even the best - performing model achieving only 6.9% accuracy on tasks that humans solve d at 7.5 times higher success (52%). These results expose focusing failures in how AI systems process complex auditory scenes, particularly in selective attention, noise robustness, and contextual adaptation. Our benchmark not only quantifies the human - machine auditory gap but also provides insights into why these failures occur, su ggesting that current architectures lack fundamental mechanisms for human - like auditory scene analysis. The traditional design of audio CAPTCHAs highlight s common filters that humans evolved but machines fail to select in multimodal language models. This work establishes a diagnostic framework for measuring progress toward human - level machine listening and highlights the need for novel approaches integrating selective attention, physics - based audio understanding, and context - aware perception into mult imodal AI systems. Artificial intelligence has made great strides in language understanding and multimodal perception, yet machines still struggle with basic auditory tasks that humans perform successfully [1 - 20] . A striking example is the cocktail party effect [21 - 22 ] - the human ability to focus on a single conversation in a noisy room - which remains a formidable challenge for AI.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2507.23091

Country: North America > United States (0.28)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (0.95)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)

Add feedback

Breaking Audio CAPTCHAs

Neural Information Processing SystemsApr-6-2023, 14:12:34 GMT

CAP T C H A s are computer-generated tests that humans can pass but current computer systems cannot. CAP T C H A s provide a method for automatically distinguishing a human from a computer program, and therefore can protect Web services from abuse by so-called "bots." Most CAP T C H A s consist of distorted images, usually text, for which a user must provide some description. Unfortunately, visual CAP T C H A s limit access to the millions of visually impaired people using the Web. Audio CAP T C H A s were created to solve this accessibility issue; however, the security of audio CAP T C H A s was never formally tested.

captcha, classifier, digit, (15 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)

Add feedback

Audio Captcha Recognition Using RastaPLP Features by SVM

Cakmak, Ahmet Faruk, Balcilar, Muhammet

arXiv.org Machine LearningJan-7-2019

Nowadays, CAPTCHAs are computer generated tests that human can pass but current computer systems can not. They have common usage in various web services in order to be able to detect a human from computer programs autonomously. In this way, owners can protect their web services from bots. In addition to visual CAPTCHAs which consist of distorted images, mostly test images, that a user must write some description about that image, there are a significant amount of audio CAPTCHAs as well. Briefly, audio CAPTCHAs are sound files which consist of human sound under heavy noise where the speaker pronounces a bunch of digits consecutively. Generally, in those sound files, there are some periodic and non-periodic noises to get difficult to recognize them with a program but not for a human listener. We gathered numerous randomly collected audio file to train and then test them using our SVM algorithm to be able to extract digits out of each conversation.

algorithm, captcha, digit, (14 more...)

arXiv.org Machine Learning

1901.02153

Country:

North America > United States > Tennessee > Rutherford County > Murfreesboro (0.04)
Europe > France (0.04)

Genre: Research Report (1.00)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.30)

Add feedback

Breaking Audio CAPTCHAs

Tam, Jennifer, Simsa, Jiri, Hyde, Sean, Ahn, Luis V.

Neural Information Processing SystemsDec-31-2009

CAPTCHAs are computer-generated tests that humans can pass but current computer systems cannot. CAPTCHAs provide a method for automatically distinguishing a human from a computer program, and therefore can protect Web services from abuse by so-called "bots." Most CAPTCHAs consist of distorted images, usually text, for which a user must provide some description. Unfortunately, visual CAPTCHAs limit access to the millions of visually impaired people using the Web. Audio CAPTCHAs were created to solve this accessibility issue; however, the security of audio CAPTCHAs was never formally tested.

artificial intelligence, captcha, machine learning, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.05)
North America > United States > California > San Francisco County > San Francisco (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.96)

Add feedback

Breaking Audio CAPTCHAs

Tam, Jennifer, Simsa, Jiri, Hyde, Sean, Ahn, Luis V.

Neural Information Processing SystemsDec-31-2009

CAPTCHAs are computer-generated tests that humans can pass but current computer systems cannot. CAPTCHAs provide a method for automatically distinguishing a human from a computer program, and therefore can protect Web services from abuse by so-called "bots." Most CAPTCHAs consist of distorted images, usually text, for which a user must provide some description. Unfortunately, visual CAPTCHAs limit access to the millions of visually impaired people using the Web. Audio CAPTCHAs were created to solve this accessibility issue; however, the security of audio CAPTCHAs was never formally tested.

audio captcha, captcha, classifier, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.05)
North America > United States > California > San Francisco County > San Francisco (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.96)

Add feedback

Breaking Audio CAPTCHAs

Tam, Jennifer, Simsa, Jiri, Hyde, Sean, Ahn, Luis V.

Neural Information Processing SystemsDec-31-2009

CAPTCHAs are computer-generated tests that humans can pass but current computer systems cannot. CAPTCHAs provide a method for automatically distinguishing a human from a computer program, and therefore can protect Web services from abuse by so-called "bots." Most CAPTCHAs consist of distorted images, usually text, for which a user must provide some description. Unfortunately, visual CAPTCHAs limit access to the millions of visually impaired people using the Web. Audio CAPTCHAs were created to solve this accessibility issue; however, the security of audio CAPTCHAs was never formally tested.

artificial intelligence, captcha, machine learning, (18 more...)

Neural Information Processing Systems

Country: North America > United States (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.96)

Add feedback